Validation of language resources in TC-STAR

Authors

  • Henk van den Heuvel
  • Eric Sanders
Abstract

In TC-STAR a variety of Language Resources (LR) are being produced. In this contribution we address the validation of resources that were created and used for the second Evaluation Campaign of the project. For the three types of topics covered by the project (ASR, SLT, TTS) the validation of both development and evaluation sets is described. For each type we successively address the description of the data, the validation procedures and the validation results. It is concluded that validation constitutes an important and useful element in the production of high quality TC-STAR language resources.

Related resources

TC-STAR: New language resources for ASR and SLT purposes

In TC-STAR a variety of Language Resources (LR) is being produced. In this contribution we address the resources that have been created for Automatic Speech Recognition and Spoken Language Translation. As yet, these are 14 LR in total: two training SLR for ASR (English and Spanish), three development LR and three evaluation LR for ASR (English, Spanish, Mandarin), and three development LR and ...

Development, Factor Analysis, and Validation of an EFL Teacher Change Scale (TCS)

The concept of teacher change is critical in second language teaching and in the English as a Foreign Language (EFL) context, largely because almost everything we do in teacher education aims to initiate change of one sort or another. A substantial body of research has been dedicated to investigating teacher change (TC) from various perspectives. However, having studied the related lite...

TC-STAR: Specifications of Language Resources and Evaluation for Speech Synthesis

In the framework of the EU-funded project TC-STAR (Technology and Corpora for Speech to Speech Translation), research on TTS aims at providing a synthesized voice that sounds like the source speaker speaking the target language. To progress in this direction, research is focused on naturalness, intelligibility, expressivity and voice conversion, all within the TC-STAR framework. For this purpose, spe...

Creating Slovenian Language Resources for Development of Speech-to-speech Translation Components

This article gives detailed information about the procedures for building Slovenian lexica within the LC-STAR project, and about the size of those lexica. The University of Maribor joined the LC-STAR project in order to provide appropriate language resources for developing speech-to-speech translation technology for the Slovenian language. The lexica consist of three parts: 65,000 common w...

Exploring XML-based technologies and procedures for quality evaluation from a real-life case perspective

The use of Extensible Markup Language (XML) for the annotation of Spoken Language Resources (SLR) is becoming increasingly common these days. Therefore the Speech Processing EXpertise centre (SPEX), which is the SLR validation centre of the European Language Resources Association (ELRA), is also being confronted more with XML. The project “Lexica and Corpora for Speech-to-Speech Translation Com...



Publication date: 2006